Goto

Collaborating Authors

 communication constraint



Optimal testing using combined test statistics across independent studies

Neural Information Processing Systems

Combining test statistics from independent trials or experiments is a popular method of meta-analysis. However, there is very limited theoretical understanding of the power of the combined test, especially in high-dimensional models considering composite hypotheses tests. We derive a mathematical framework to study standard meta-analysis testing approaches in the context of the many normal means model, which serves as the platform to investigate more complex models. We introduce a natural and mild restriction on the meta-level combination functions of the local trials. This allows us to mathematically quantify the cost of compressing m trials into real-valued test statistics and combining these. We then derive minimax lower and matching upper bounds for the separation rates of standard combination methods for e.g.



398475c83b47075e8897a083e97eb9f0-Supplemental.pdf

Neural Information Processing Systems

We revisit first-order optimization under local information constraints such as local privacy, gradient quantization, and computational constraints limiting access to a few coordinates of the gradient. In this setting, the optimization algorithm is not allowed to directly access the complete output of the gradient oracle, but only gets limited information about it subject to the local information constraints. We study the role of adaptivity in processing the gradient output to obtain this limited information from it. We consider optimization for both convex and strongly convex functions and obtain tight or nearly tight lower bounds for the convergence rate, when adaptive gradient processing is allowed. Prior work was restricted to convex functions and allowed only nonadaptive processing of gradients. For both of these function classes and for the three information constraints mentioned above, our lower bound implies that adaptive processing of gradients cannot outperform nonadaptive processing in most regimes of interest. We complement these results by exhibiting a natural optimization problem under information constraints for which adaptive processing of gradient strictly outperforms nonadaptive processing.


Order-Optimal Sequential 1-Bit Mean Estimation in General Tail Regimes

arXiv.org Machine Learning

In this paper, we study the problem of mean estimation under strict 1-bit communication constraints. We propose a novel adaptive mean estimator based solely on randomized threshold queries, where each 1-bit outcome indicates whether a given sample exceeds a sequentially chosen threshold. Our estimator is $(ฮต, ฮด)$-PAC for any distribution with a bounded mean $ฮผ\in [-ฮป, ฮป]$ and a bounded $k$-th central moment $\mathbb{E}[|X-ฮผ|^k] \le ฯƒ^k$ for any fixed $k > 1$. Crucially, our sample complexity is order-optimal in all such tail regimes, i.e., for every such $k$ value. For $k \neq 2$, our estimator's sample complexity matches the unquantized minimax lower bounds plus an unavoidable $O(\log(ฮป/ฯƒ))$ localization cost. For the finite-variance case ($k=2$), our estimator's sample complexity has an extra multiplicative $O(\log(ฯƒ/ฮต))$ penalty, and we establish a novel information-theoretic lower bound showing that this penalty is a fundamental limit of 1-bit quantization. We also establish a significant adaptivity gap: for both threshold queries and more general interval queries, the sample complexity of any non-adaptive estimator must scale linearly with the search space parameter $ฮป/ฯƒ$, rendering it vastly less sample efficient than our adaptive approach. Finally, we present algorithmic variants that (i) handle an unknown sampling budget, (ii) adapt to an unknown scale parameter~$ฯƒ$ given (possibly loose) bounds, and (iii) require only two stages of adaptivity at the expense of more complicated general 1-bit queries.